 weight initialization method


Reducing Oversmoothing through Informed Weight Initialization in Graph Neural Networks

Kelesis, Dimitrios, Fotakis, Dimitris, Paliouras, Georgios

arXiv.org Artificial Intelligence

In this work, we generalize the ideas of Kaiming initialization to Graph Neural Networks (GNNs) and propose a new scheme (G-Init) that reduces oversmoothing, leading to very good results in node and graph classification tasks. GNNs are commonly initialized using methods designed for other types of Neural Networks, overlooking the underlying graph topology. We theoretically analyze the variance of signals flowing forward and of gradients flowing backward in the class of convolutional GNNs. We then simplify our analysis to the case of the GCN and propose a new initialization method. Our results indicate that the new method (G-Init) reduces oversmoothing in deep GNNs, facilitating their effective use. Experimental validation supports our theoretical findings, demonstrating the advantages of deep networks in scenarios with no feature information for unlabeled nodes (i.e., the "cold start" scenario).
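The abstract does not spell out the G-Init rule itself, but the general idea of folding graph topology into a Kaiming-style variance calculation can be sketched. In the snippet below, the helper name degree_aware_kaiming and the choice of the average node degree as the scaling factor are assumptions made purely for illustration, not the paper's exact formula.

```python
import torch

def degree_aware_kaiming(fan_in: int, fan_out: int, avg_degree: float) -> torch.Tensor:
    """Hypothetical degree-aware Kaiming-style initializer for a GCN weight matrix.

    Kaiming initialization preserves forward-signal variance for ReLU layers with
    Var(W) = 2 / fan_in. A normalized graph convolution additionally averages over
    neighbors, shrinking the signal variance roughly in proportion to the node
    degree, so this sketch compensates by scaling the weight variance up by the
    average degree. Illustrative assumption only; not the published G-Init rule.
    """
    std = (2.0 * avg_degree / fan_in) ** 0.5
    return torch.randn(fan_in, fan_out) * std

# Example: a 64 -> 64 GCN layer on a graph with mean degree ~4.
W = degree_aware_kaiming(64, 64, avg_degree=4.0)
```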


Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis

Lee, Hyunwoo, Choi, Hayoung, Kim, Hyunju

arXiv.org Artificial Intelligence

As a neural network's depth increases, it can achieve strong generalization performance. Training, however, becomes challenging due to gradient issues. Theoretical research and various methods have been introduced to address these issues. However, research on weight initialization methods that can be applied effectively to tanh neural networks of varying sizes remains incomplete. This paper presents a novel weight initialization method for Feedforward Neural Networks with the tanh activation function. Based on an analysis of the fixed points of the function $\tanh(ax)$, our proposed method aims to determine values of $a$ that prevent the saturation of activations. A series of experiments on various classification datasets demonstrates that the proposed method is more robust to variations in network size than the existing method. Furthermore, when applied to Physics-Informed Neural Networks, the method exhibits faster convergence and greater robustness to variations in network size than Xavier initialization on Partial Differential Equation problems.
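The selection rule for $a$ is not given in the abstract, but the fixed-point analysis it builds on is easy to reproduce. The sketch below, assuming simple fixed-point iteration of $x \mapsto \tanh(ax)$, shows how the stable fixed point moves away from zero (toward saturation) as $a$ grows past 1; the paper's actual criterion for choosing $a$ is not reproduced here.

```python
import numpy as np

def tanh_fixed_point(a: float, x0: float = 1.0, iters: int = 500) -> float:
    """Approximate the stable fixed point of x = tanh(a * x) by iteration.

    For a <= 1 the only fixed point is 0, so repeated application of tanh(a*x)
    drives activations toward zero (slowly when a is close to 1); for a > 1 two
    additional stable fixed points appear and activations can saturate near +-1.
    """
    x = x0
    for _ in range(iters):
        x = np.tanh(a * x)
    return x

for a in (0.5, 1.0, 1.5, 3.0):
    print(f"a = {a}: fixed point ~= {tanh_fixed_point(a):.4f}")
```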


An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification

Boulila, Wadii, Alshanqiti, Eman, Alzahem, Ayyub, Koubaa, Anis, Mlaiki, Nabil

arXiv.org Artificial Intelligence

The growing interest in satellite imagery has triggered the need for efficient mechanisms to extract valuable information from these vast data sources, providing deeper insights. Although deep learning has made significant progress in satellite image classification, only a few results on weight initialization techniques can be found in the literature. These techniques traditionally involve initializing the network's weights before training on extensive datasets, as distinct from fine-tuning the weights of pre-trained networks. In this study, a novel weight initialization method is proposed in the context of satellite image classification. The proposed weight initialization method is mathematically detailed during the forward and backward passes of the convolutional neural network (CNN) model. Extensive experiments are carried out using six real-world datasets. Comparative analyses with existing weight initialization techniques, conducted on various well-known CNN models, reveal that the proposed technique outperforms previous competitive techniques in classification accuracy. The complete code of the proposed technique, along with the obtained results, is available at https://github.com/WadiiBoulila/Weight-Initialization
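The abstract does not specify the proposed initializer, so as background only, here is a minimal sketch of the standard forward/backward variance bookkeeping (He-style) for a convolutional layer. The paper's method presumably refines this kind of analysis; the code below is not that method.

```python
import torch
import torch.nn as nn

def variance_preserving_conv_init(conv: nn.Conv2d, mode: str = "fan_in") -> None:
    """Background sketch: He-style variance-preserving init for a conv layer.

    fan_in  = in_channels  * kernel_h * kernel_w  (governs forward signal variance)
    fan_out = out_channels * kernel_h * kernel_w  (governs backward gradient variance)
    Shown only to illustrate the forward/backward analysis the abstract refers to;
    it is NOT the paper's proposed technique.
    """
    k_h, k_w = conv.kernel_size
    fan = conv.in_channels * k_h * k_w if mode == "fan_in" else conv.out_channels * k_h * k_w
    std = (2.0 / fan) ** 0.5
    with torch.no_grad():
        conv.weight.normal_(0.0, std)
        if conv.bias is not None:
            conv.bias.zero_()

conv = nn.Conv2d(3, 64, kernel_size=3)
variance_preserving_conv_init(conv)
```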


Improved weight initialization for deep and narrow feedforward neural network

Lee, Hyunwoo, Kim, Yunho, Yang, Seungyeop, Choi, Hayoung

arXiv.org Artificial Intelligence

Appropriate weight initialization settings, along with the ReLU activation function, have been a cornerstone of modern deep learning, making it possible to train and deploy highly effective and efficient neural network models across diverse artificial intelligence applications. The problem of dying ReLU, where ReLU neurons become inactive and yield zero output, presents a significant challenge in the training of deep neural networks with the ReLU activation function. Theoretical research and various methods have been introduced to address the problem. However, even with these methods, training remains challenging for extremely deep and narrow feedforward networks with the ReLU activation function. In this paper, we propose a new weight initialization method to address this issue. We prove the properties of the proposed initial weight matrix and demonstrate how these properties facilitate the effective propagation of signal vectors. Through a series of experiments and comparisons with existing methods, we demonstrate the effectiveness of the new initialization method.
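The paper's initial weight matrix is not described in the abstract, but the failure mode it targets is easy to demonstrate: with standard He initialization, signal vectors pushed through a very deep and narrow ReLU network tend to collapse, leaving units that output zero for every input. The sketch below (helper name, widths, and depths chosen arbitrarily for illustration) measures that collapse.

```python
import torch

def dead_unit_fraction(depth: int, width: int, n_samples: int = 256) -> float:
    """Propagate random inputs through a deep, narrow, bias-free ReLU MLP with
    He (Kaiming) initialization and report the fraction of final-layer units
    that are zero for every input in the batch ("dead" for this batch).

    Illustrates the failure mode only; the paper's own initial weight matrix
    construction is not reproduced here.
    """
    x = torch.randn(n_samples, width)
    for _ in range(depth):
        w = torch.randn(width, width) * (2.0 / width) ** 0.5  # He init
        x = torch.relu(x @ w)
    return (x.abs().sum(dim=0) == 0).float().mean().item()

for depth in (10, 100, 500):
    print(f"depth={depth:4d}, width=8 -> dead fraction {dead_unit_fraction(depth, 8):.2f}")
```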


Vanishing Nodes: Another Phenomenon That Makes Training Deep Neural Networks Difficult

Chang, Wen-Yu, Lin, Tsung-Nan

arXiv.org Machine Learning

It is well known that the problem of vanishing/exploding gradients is a challenge when training deep networks. In this paper, we describe another phenomenon, called vanishing nodes, that also increases the difficulty of training deep neural networks. As the depth of a neural network increases, its hidden nodes exhibit increasingly correlated behavior, resulting in great similarity between these nodes. The redundancy of hidden nodes thus increases as the network becomes deeper. We call this problem vanishing nodes, and we propose the vanishing node indicator (VNI) as a metric for quantitatively measuring its degree. The VNI can be characterized by the network parameters and is shown analytically to be proportional to the depth of the network and inversely proportional to the network width. The theoretical results show that the effective number of nodes vanishes to one when the VNI increases to one (its maximal value), and that vanishing/exploding gradients and vanishing nodes are two different challenges that increase the difficulty of training deep neural networks. The numerical results from the experiments suggest that the degree of vanishing nodes becomes more evident during back-propagation training, and that when the VNI is equal to 1, the network cannot learn simple tasks (e.g., the XOR problem) even when the gradients are neither vanishing nor exploding. We refer to this kind of gradient as walking dead gradients, which cannot help the network converge even when their scale is relatively large. Finally, the experiments show that the likelihood of failed training increases with the depth of the network, as training becomes much more difficult due to the lack of network representation capability.
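The abstract states that the VNI is characterized by the network parameters but does not give its formula. As a rough, assumption-laden proxy, the snippet below measures redundancy as the mean absolute pairwise correlation of hidden activations over a batch; a value near 1 corresponds to the regime the paper describes, where the effective number of nodes shrinks toward one. The function name and the correlation-based definition are illustrative choices, not the paper's VNI.

```python
import torch

def node_redundancy(hidden: torch.Tensor) -> float:
    """Proxy for node redundancy: mean absolute pairwise correlation between
    hidden units, computed over a batch of activations of shape (batch, width).

    Not the paper's VNI; it only captures the idea that highly correlated
    hidden nodes are redundant (a value near 1 means the layer effectively
    behaves like a single node).
    """
    corr = torch.corrcoef(hidden.T)                 # (width, width) correlation matrix
    n = corr.shape[0]
    off_diag = corr - torch.eye(n)                  # zero out the diagonal
    return off_diag.abs().sum().item() / (n * (n - 1))

# Example: activations of a 32-unit layer over a batch of 128 inputs.
acts = torch.randn(128, 32)
print(f"redundancy of random activations: {node_redundancy(acts):.3f}")
```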


Deep Learning Best Practices: Activation Functions & Weight Initialization Methods -- Part 1

#artificialintelligence

One of the reasons that deep learning has become more popular in the past decade is better learning algorithms, which have led to faster convergence or better performance of neural networks in general. Along with better learning algorithms, the introduction of better activation functions and better initialization methods helps us create better neural networks. Note: This article assumes that the reader has a basic understanding of neural networks, weights, biases, and backpropagation. In this article, we discuss some of the activation functions and weight initialization methods commonly used while training a deep neural network. To be more specific, we will be covering the following.
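As a concrete companion to the pairings such posts usually discuss, here is a short PyTorch snippet using the standard built-in initializers (Xavier/Glorot for tanh or sigmoid layers, He/Kaiming for ReLU layers). This is generic usage, not a summary of the article's specific recommendations.

```python
import torch.nn as nn

# Typical pairings: Xavier/Glorot init for tanh/sigmoid layers,
# He/Kaiming init for ReLU layers.
tanh_layer = nn.Linear(256, 256)
relu_layer = nn.Linear(256, 256)

nn.init.xavier_uniform_(tanh_layer.weight)                        # Glorot: Var(W) = 2 / (fan_in + fan_out)
nn.init.kaiming_normal_(relu_layer.weight, nonlinearity="relu")   # He: Var(W) = 2 / fan_in
nn.init.zeros_(tanh_layer.bias)
nn.init.zeros_(relu_layer.bias)
```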


Weight Initialization without Local Minima in Deep Nonlinear Neural Networks

Nitta, Tohru

arXiv.org Machine Learning

In this paper, we propose a new weight initialization method called even initialization for wide and deep nonlinear neural networks with the ReLU activation function. We prove that no poor local minimum exists in the initial loss landscape of a wide and deep nonlinear neural network initialized by the proposed even initialization method. Specifically, in the initial loss landscape of such a wide and deep ReLU neural network model, the following four statements hold true: 1) the loss function is non-convex and non-concave; 2) every local minimum is a global minimum; 3) every critical point that is not a global minimum is a saddle point; and 4) bad saddle points exist. We also show that the weight values produced by the even initialization method are contained in those produced by both the commonly used standard initialization and the He initialization method.